Multi-Cache Profiling of Parallel Processing Programs Using Simics
Abstract
This paper presents a multi-cache profiler for shared-memory multiprocessor systems. For each static data structure of a program, the profiler outputs the read- and write-miss frequencies that are due to cache line migrations. It thereby identifies the static data structures whose manipulation results in excessive cache line migrations, a potential source of excessive false misses. The frequency of line migrations from cache to cache may depend inherently on the algorithm or on how it is coded. The paper illustrates that the profiled data can be useful for analyzing algorithms and programs from a cache-performance point of view, as well as for code optimizations that reduce cache line migrations. The profiler is created by extending Simics' single-cache profiling capabilities. It combines the profiled data from the individual cache memories with the virtual memory addresses assigned to each of the static data structures of a parallel processing program.
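The combining step described in the abstract can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation and not a Simics API: the function name `attribute_misses`, the symbol tuples `(name, start_address, size_in_bytes)`, and the per-cache records `(address, read_misses, write_misses)` are all hypothetical, and the records are assumed to already count only the misses caused by line migrations.

```python
from bisect import bisect_right
from collections import defaultdict


def attribute_misses(symbols, per_cache_records):
    """Sum migration-related misses per static data structure.

    symbols: list of (name, start_address, size_in_bytes)      -- hypothetical format
    per_cache_records: dict mapping a cache id to a list of
        (address, read_misses, write_misses)                   -- hypothetical format
    """
    # Sort structures by start address so each record can be located
    # with a binary search instead of a linear scan.
    ordered = sorted(symbols, key=lambda s: s[1])
    starts = [s[1] for s in ordered]
    totals = defaultdict(lambda: {"read": 0, "write": 0})

    # Merge the records of every cache into one per-structure total.
    for records in per_cache_records.values():
        for addr, read_misses, write_misses in records:
            i = bisect_right(starts, addr) - 1
            if i < 0:
                continue  # address lies below every known structure
            name, start, size = ordered[i]
            if addr < start + size:  # skip addresses past the structure's end
                totals[name]["read"] += read_misses
                totals[name]["write"] += write_misses
    return dict(totals)


# Made-up example data: two structures, two caches.
symbols = [("counters", 0x1000, 64), ("matrix", 0x2000, 4096)]
records = {
    0: [(0x1000, 3, 5), (0x2010, 1, 0)],
    1: [(0x1008, 4, 6), (0x3000, 9, 9)],  # 0x3000 is outside both structures
}
result = attribute_misses(symbols, records)
print(result)
```

In this made-up data, both caches report misses inside `counters`, which is the kind of structure such a profile would flag as a candidate for excessive line migration.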
Related resources
Nahalal: Memory Organization for Chip Multiprocessors
This paper addresses cache organization in Chip Multiprocessors (CMPs). We introduce Nahalal, a novel nonuniform cache (NUCA) topology that enables fast access to shared data for all processors, while preserving the vicinity of private data to each processor. Our characterization of memory access patterns in typical parallel programs shows that such a topology is appropriate for common multi-...
Memory Performance Analysis for Parallel Programs Using Concurrent Reuse Distance
Performance on multicore processors is determined largely by on-chip cache. Computer architects have conducted numerous studies in the past that vary core count and cache capacity as well as problem size to understand impact on cache behavior. These studies are very costly due to the combinatorial design spaces they must explore. Reuse distance (RD) analysis can help architects explore multicor...
Wind River Simics for Multi-core Systems Development
The hardware shift to multi-core processors and multiprocessor systems calls for new software and systems development tools to help developers transform their code into parallel applications and gain performance increases. Developers now have to know how to create software and architect systems that can use parallel hardware efficiently. Virtualized systems development is a development methodol...
Profiling EEMBC MultiBench Programs using Full-system Simulations
This paper presents the profiling of EEMBC MultiBench programs. We executed 16 parallel benchmark workloads on the M5 simulator. The target system contains 64 dual-issue cores running at 2 GHz. Each core has a 16 KB I-cache and a 16 KB D-cache. The cores share a total of 16 × 1 MB L2 caches through a 64-byte-wide L1-to-L2 bus running at 1 GHz. We measure the application performance (instruction-per-cycl...
SimICS/Sun4m: A Virtual Workstation
System level simulators allow computer architects and system software designers to recreate an accurate and complete replica of the program behavior of a target system, regardless of the availability, existence, or instrumentation support of such a system. Applications include evaluation of architectural design alternatives as well as software engineering tasks such as traditional debugging and...